moral value
Beyond Human Judgment: A Bayesian Evaluation of LLMs' Moral Values Understanding
Skorski, Maciej, Landowska, Alina
How do Large Language Models understand moral dimensions compared to humans? This first large-scale Bayesian evaluation of market-leading language models provides the answer. In contrast to prior work using deterministic ground truth (majority or inclusion rules), we model annotator disagreements to capture both aleatoric uncertainty (inherent human disagreement) and epistemic uncertainty (model domain sensitivity). We evaluated the best language models (Claude Sonnet 4, DeepSeek-V3, Llama 4 Maverick) across 250K+ annotations from nearly 700 annotators over 100K+ texts spanning social networks, news, and forums. Our GPU-optimized Bayesian framework processed 1M+ model queries, revealing that AI models typically rank among the top 25% of human annotators, achieving much better than average balanced accuracy. Importantly, we find that AI produces far fewer false negatives than humans, highlighting their more sensitive moral detection capabilities.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > France (0.04)
- Europe > Poland (0.04)
- (2 more...)
- Health & Medicine (0.68)
- Information Technology > Services (0.48)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
- (2 more...)
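The disagreement-aware evaluation described above admits a small illustration. The following is a minimal sketch of the idea, not the authors' GPU-optimized framework: per-item human agreement gets a Beta-Binomial treatment (the posterior spread reflecting aleatoric uncertainty), and a model is ranked against annotators by balanced accuracy. All names are hypothetical.

    # Minimal sketch of disagreement-aware evaluation; assumptions, not the
    # paper's actual framework.
    import numpy as np

    def posterior_label_prob(pos_votes, n_votes, alpha=1.0, beta=1.0):
        # Posterior mean of the per-item probability that the moral label is
        # positive, under a Beta(alpha, beta) prior; the residual spread is
        # the aleatoric (human-disagreement) component.
        return (pos_votes + alpha) / (n_votes + alpha + beta)

    def balanced_accuracy(y_true, y_pred):
        # Mean of sensitivity and specificity; robust to label imbalance.
        tp = np.sum((y_true == 1) & (y_pred == 1))
        tn = np.sum((y_true == 0) & (y_pred == 0))
        fn = np.sum((y_true == 1) & (y_pred == 0))
        fp = np.sum((y_true == 0) & (y_pred == 1))
        return 0.5 * (tp / (tp + fn) + tn / (tn + fp))

    # Rank a model among humans: score every annotator and the model against
    # thresholded soft labels, then read off the model's percentile.

Under this reading, "top 25% of human annotators" means the model's balanced accuracy falls in the upper quartile of the per-annotator score distribution.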
An Anthropologist LLM to Elicit Users' Moral Preferences through Role-Play
De Ninno, Gianluca, Inverardi, Paola, Belotti, Francesca
GPT can predict users' future decisions by analyzing narrative tables, with accuracy further improved when guided by an anthropological framework. Moreover, by integrating contextual knowledge and an interpretative lens into LLMs, this approach enhances AI explainability while ensuring a human-centric perspective in requirements elicitation. By asking GPT to generate a user profile, it becomes possible to directly assess what the model has understood about the user and how it represents them. Furthermore, since the model is not only tasked with predicting users' responses in new scenarios but also with justifying its choices, it is possible, on one hand, to understand the rationale behind the model's output and, on the other, to identify potential misalignments between the model's predictions and the user's actual values and preferences. This enables targeted interventions to improve alignment between the LLM and the user profile, creating a continuous feedback loop that involves both the user and the LLM trained to interpret data through an anthropological lens. The process strengthens the model's interpretability, ethical alignment, and predictive adaptability, thereby making AI systems more transparent and attuned to real-world human values. Ultimately, the approach lays the groundwork for AI assistants capable of recognizing and adapting to individuals' soft ethics and ethical decision-making processes.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- (7 more...)
- Leisure & Entertainment > Games > Computer Games (1.00)
- Information Technology > Security & Privacy (1.00)
- Information Technology > Software (0.68)
- Law > Civil Rights & Constitutional Law (0.67)
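The predict-justify-realign loop sketched in this abstract can be made concrete. Below is a hedged sketch under our own assumptions: query_llm stands in for any chat-completion call, and the prompts, profile format, and misalignment check are illustrative, not the authors' artifacts.

    # Illustrative feedback loop; every prompt and helper here is hypothetical.
    def elicitation_loop(query_llm, narrative_table, scenarios, ask_user):
        # 1. Elicit an anthropologically framed user profile.
        profile = query_llm(
            "Acting as an anthropologist, infer a user profile (values, "
            "soft ethics) from these narratives:\n" + narrative_table)
        for scenario in scenarios:
            # 2. Predict the user's decision and justify it.
            answer = query_llm(
                "Given this profile:\n" + profile +
                "\nPredict and justify the user's decision in:\n" + scenario)
            # 3. Compare with the user's actual choice; realign on mismatch.
            actual = ask_user(scenario)
            if actual not in answer:  # crude misalignment detector
                profile = query_llm(
                    "Revise the profile:\n" + profile +
                    "\nThe user actually chose " + repr(actual) +
                    " in:\n" + scenario)
        return profile

The point of the sketch is the control flow: each justified prediction exposes the model's rationale, and each mismatch feeds a targeted profile revision.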
One Model, Many Morals: Uncovering Cross-Linguistic Misalignments in Computational Moral Reasoning
Farid, Sualeha, Lin, Jayden, Chen, Zean, Kumar, Shivani, Jurgens, David
Large Language Models (LLMs) are increasingly deployed in multilingual and multicultural environments where moral reasoning is essential for generating ethically appropriate responses. Yet, the dominant pretraining of LLMs on English-language data raises critical concerns about their ability to generalize judgments across diverse linguistic and cultural contexts. In this work, we systematically investigate how language mediates moral decision-making in LLMs. We translate two established moral reasoning benchmarks into five culturally and typologically diverse languages, enabling multilingual zero-shot evaluation. Our analysis reveals significant inconsistencies in LLMs' moral judgments across languages, often reflecting cultural misalignment. Through a set of carefully constructed research questions, we uncover the underlying drivers of these disparities, ranging from outright disagreements in verdicts to the reasoning strategies LLMs employ. Finally, through a case study, we trace the role of pretraining data in shaping an LLM's moral compass. Through this work, we distill our insights into a structured typology of moral reasoning errors that calls for more culturally aware AI.
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > Jordan (0.04)
- South America (0.04)
- (6 more...)
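One way to quantify the cross-lingual inconsistencies reported above is to compare a model's verdicts on parallel (translated) items. The sketch below uses pairwise agreement and Jensen-Shannon distance; the metric choice is our assumption, not necessarily the paper's.

    # Hedged sketch: cross-lingual (in)consistency of binary moral verdicts.
    import numpy as np
    from itertools import combinations
    from scipy.spatial.distance import jensenshannon

    def pairwise_agreement(judgments):
        # judgments: dict lang -> 0/1 array of verdicts on parallel items.
        return {(a, b): float(np.mean(judgments[a] == judgments[b]))
                for a, b in combinations(judgments, 2)}

    def verdict_divergence(judgments):
        # Jensen-Shannon distance between per-language verdict distributions.
        dists = {}
        for a, b in combinations(judgments, 2):
            pa = np.bincount(judgments[a], minlength=2) / len(judgments[a])
            pb = np.bincount(judgments[b], minlength=2) / len(judgments[b])
            dists[(a, b)] = float(jensenshannon(pa, pb))
        return dists

Low agreement on the same items across translations is then direct evidence that language, not content, mediates the judgment.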
Beyond Ethical Alignment: Evaluating LLMs as Artificial Moral Assistants
Galatolo, Alessio, Rappuoli, Luca Alberto, Winkle, Katie, Beloucif, Meriem
The recent rise in popularity of large language models (LLMs) has prompted considerable concerns about their moral capabilities. Although substantial effort has been dedicated to aligning LLMs with human moral values, existing benchmarks and evaluations remain largely superficial, typically measuring alignment based on final ethical verdicts rather than explicit moral reasoning. In response, this paper aims to advance the investigation of LLMs' moral capabilities by examining their capacity to function as Artificial Moral Assistants (AMAs), systems envisioned in the philosophical literature to support human moral deliberation. We assert that qualifying as an AMA requires more than what state-of-the-art alignment techniques aim to achieve: not only must AMAs be able to discern ethically problematic situations, they should also be able to actively reason about them, navigating between conflicting values outside of those embedded in the alignment phase. Building on existing philosophical literature, we begin by designing a new formal framework of the specific kind of behaviour an AMA should exhibit, individuating key qualities such as deductive and abductive moral reasoning. Drawing on this theoretical framework, we develop a benchmark to test these qualities and evaluate popular open LLMs against it. Our results reveal considerable variability across models and highlight persistent shortcomings, particularly regarding abductive moral reasoning. Our work connects theoretical philosophy with practical AI evaluation while also emphasising the need for dedicated strategies to explicitly enhance moral reasoning capabilities in LLMs.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Sweden > Uppsala County > Uppsala (0.04)
The Pluralistic Moral Gap: Understanding Judgment and Value Differences between Humans and Large Language Models
Russo, Giuseppe, Nozza, Debora, Röttger, Paul, Hovy, Dirk
People increasingly rely on Large Language Models (LLMs) for moral advice, which may influence humans' decisions. Yet, little is known about how closely LLMs align with human moral judgments. To address this, we introduce the Moral Dilemma Dataset, a benchmark of 1,618 real-world moral dilemmas, each paired with a distribution of human moral judgments consisting of a binary evaluation and a free-text rationale. We treat this problem as a pluralistic distributional alignment task, comparing the distributions of LLM and human judgments across dilemmas. We find that models reproduce human judgments only under high consensus; alignment deteriorates sharply when human disagreement increases. In parallel, using a 60-value taxonomy built from 3,783 value expressions extracted from rationales, we show that LLMs rely on a narrower set of moral values than humans. These findings reveal a pluralistic moral gap: a mismatch in both the distribution and diversity of values expressed. To close this gap, we introduce Dynamic Moral Profiling (DMP), a Dirichlet-based sampling method that conditions model outputs on human-derived value profiles. DMP improves alignment by 64.3% and enhances value diversity, offering a step toward more pluralistic and human-aligned moral guidance from LLMs.
- North America > United States (0.14)
- Asia > Middle East > Jordan (0.04)
- Health & Medicine (0.68)
- Education (0.46)
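The Dirichlet step behind Dynamic Moral Profiling admits a compact sketch. The version below is our reading under stated assumptions: concentration parameters are proportional to human value-expression counts for a dilemma, one profile is drawn per generation, and the prompt wiring is illustrative.

    # Hedged sketch of Dirichlet-based value profiling; assumptions, not
    # DMP's exact recipe.
    import numpy as np

    def sample_value_profile(value_counts, top_k=5, rng=None):
        # value_counts: dict value_name -> human expression count.
        rng = rng or np.random.default_rng()
        names = list(value_counts)
        alpha = np.array([value_counts[v] for v in names], float) + 1.0
        weights = rng.dirichlet(alpha)          # one profile draw
        top = np.argsort(weights)[::-1][:top_k]
        return [(names[i], float(weights[i])) for i in top]

    profile = sample_value_profile({"care": 12, "fairness": 7, "loyalty": 2})
    prefix = "Weigh these values when judging the dilemma: " + \
             ", ".join(f"{v} ({w:.2f})" for v, w in profile)

Sampling a fresh profile per generation, rather than always using the mean profile, is what preserves value diversity across outputs.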
Comparing Moral Values in Western English-speaking societies and LLMs with Word Associations
Xiang, Chaoyi, Liu, Chunhua, De Deyne, Simon, Frermann, Lea
As the impact of large language models increases, understanding the moral values they reflect becomes ever more important. Assessing the nature of moral values as understood by these models via direct prompting is challenging due to potential leakage of human norms into model training data, and their sensitivity to prompt formulation. Instead, we propose to use word associations, which have been shown to reflect moral reasoning in humans, as low-level underlying representations to obtain a more robust picture of LLMs' moral reasoning. We study moral differences in associations from Western English-speaking communities and LLMs trained predominantly on English data. First, we create a large dataset of LLM-generated word associations, resembling an existing dataset of human word associations. Next, we propose a novel method to propagate moral values, based on seed words derived from Moral Foundations Theory, through the human and LLM-generated association graphs. Finally, we compare the resulting moral conceptualizations, highlighting detailed but systematic differences between moral values emerging from English speakers and LLM associations.
- Oceania > Australia (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- (5 more...)
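The seed-word propagation step can be sketched as simple label propagation over the association graph; the paper's exact propagation rule may differ, and the restart formulation below is our assumption.

    # Hedged sketch: spread moral-foundation scores from MFT seed words
    # through a word-association graph by iterated neighborhood averaging
    # with restarts.
    import numpy as np

    def propagate(adj, seed_scores, n_iter=50, restart=0.15):
        # adj: (n, n) nonnegative association strengths between words.
        # seed_scores: (n, k); nonzero rows only for MFT seed words.
        adj = np.asarray(adj, dtype=float)
        row_sums = adj.sum(axis=1, keepdims=True)
        P = np.divide(adj, row_sums, out=np.zeros_like(adj),
                      where=row_sums > 0)        # row-stochastic transitions
        scores = seed_scores.astype(float)
        for _ in range(n_iter):
            scores = (1 - restart) * P @ scores + restart * seed_scores
        return scores                            # per-word foundation scores

Comparing the converged scores from the human graph and the LLM graph then localizes where the two moral conceptualizations diverge.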
LLMs as mirrors of societal moral standards: reflection of cultural divergence and agreement across ethical topics
Meijer, Mijntje, Mohammadi, Hadi, Bagheri, Ayoub
Large language models (LLMs) have become increasingly pivotal in various domains due to recent advancements in their performance capabilities. However, concerns persist regarding biases in LLMs, including gender, racial, and cultural biases derived from their training data. These biases raise critical questions about the ethical deployment and societal impact of LLMs. Acknowledging these concerns, this study investigates whether LLMs accurately reflect cross-cultural variations and similarities in moral perspectives. In assessing whether the chosen LLMs capture patterns of divergence and agreement on moral topics across cultures, three main methods are employed: (1) comparison of model-generated and survey-based moral score variances, (2) cluster alignment analysis to evaluate the correspondence between country clusters derived from model-generated moral scores and those derived from survey data, and (3) probing LLMs with direct comparative prompts. All three methods involve the use of systematic prompts and token pairs designed to assess how well LLMs understand and reflect cultural variations in moral attitudes. The findings of this study indicate overall variable and low performance in reflecting cross-cultural differences and similarities in moral values across the models tested, highlighting the necessity for improving models' accuracy in capturing these nuances effectively. The insights gained from this study aim to inform discussions on the ethical development and deployment of LLMs in global contexts, emphasizing the importance of mitigating biases and promoting fair representation across diverse cultural perspectives.
- North America > United States > New York > New York County > New York City (0.04)
- North America > United States > Virginia (0.04)
- Europe > Netherlands > Zeeland (0.04)
- Europe > Austria > Vienna (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (0.93)
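Method (2), the cluster alignment analysis, has a standard realization; the sketch below uses KMeans and the adjusted Rand index, which are our choices of clusterer and correspondence measure, not necessarily the study's.

    # Hedged sketch: do model-derived moral scores cluster countries the way
    # survey-derived scores do?
    from sklearn.cluster import KMeans
    from sklearn.metrics import adjusted_rand_score

    def cluster_alignment(survey_scores, model_scores, k=4, seed=0):
        # Both inputs: (n_countries, n_topics) moral-attitude score matrices.
        survey_labels = KMeans(n_clusters=k, random_state=seed,
                               n_init=10).fit_predict(survey_scores)
        model_labels = KMeans(n_clusters=k, random_state=seed,
                              n_init=10).fit_predict(model_scores)
        # 1.0 = identical country clusterings; ~0 = chance-level overlap.
        return adjusted_rand_score(survey_labels, model_labels)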
Explainable Moral Values: a neuro-symbolic approach to value classification
Lazzari, Nicolas, De Giorgis, Stefano, Gangemi, Aldo, Presutti, Valentina
This work explores the integration of ontology-based reasoning and machine learning techniques for explainable value classification. By relying on an ontological formalization of moral values following Moral Foundations Theory, built on the DnS Ontology Design Pattern, the sandra neuro-symbolic reasoner is used to infer values (formalized as descriptions) that are satisfied by a certain sentence. Sentences, alongside their structured representation, are automatically generated using an open-source Large Language Model. The inferred descriptions are used to automatically detect the value associated with a sentence. We show that relying on the reasoner's inferences alone yields explainable classification comparable to other, more complex approaches. We show that combining the reasoner's inferences with distributional semantics methods largely outperforms all the baselines, including complex models based on neural network architectures. Finally, we build a visualization tool to explore the potential of theory-based value classification, which is publicly available at http://xmv.geomeaning.com/.
- Europe > Italy > Emilia-Romagna > Metropolitan City of Bologna > Bologna (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- North America > United States > Pennsylvania (0.04)
- (5 more...)
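A minimal sketch of the combination that outperforms the baselines, under our assumptions about the feature wiring (sandra's real API is not reproduced here): concatenate the reasoner's description-satisfaction indicators with a dense sentence embedding and fit a linear classifier.

    # Hedged sketch: symbolic (ontology) + distributional (embedding) features.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def fit_value_classifier(satisfied, embeddings, labels):
        # satisfied: (n, d1) 0/1 indicators, 1 if the reasoner infers that a
        # sentence satisfies a given moral-value description.
        # embeddings: (n, d2) dense sentence vectors; labels: (n,) values.
        X = np.hstack([satisfied, embeddings])
        return LogisticRegression(max_iter=1000).fit(X, labels)

The satisfaction indicators keep the prediction explainable, since each active feature names a moral-value description, while the embedding supplies the distributional signal.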
Adaptable Moral Stances of Large Language Models on Sexist Content: Implications for Society and Gender Discourse
Guo, Rongchen, Nejadgholi, Isar, Dawkins, Hillary, Fraser, Kathleen C., Kiritchenko, Svetlana
This work provides an explanatory view of how LLMs can apply moral reasoning to both criticize and defend sexist language. We assessed eight large language models, all of which demonstrated the capability to provide explanations grounded in varying moral perspectives for both critiquing and endorsing views that reflect sexist assumptions. With both human and automatic evaluation, we show that all eight models produce comprehensible and contextually relevant text, which is helpful in understanding diverse views on how sexism is perceived. Also, through analysis of moral foundations cited by LLMs in their arguments, we uncover the diverse ideological perspectives in models' outputs, with some models aligning more with progressive or conservative views on gender roles and sexism. Based on our observations, we caution against the potential misuse of LLMs to justify sexist language. We also highlight that LLMs can serve as tools for understanding the roots of sexist beliefs and designing well-informed interventions. Given this dual capacity, it is crucial to monitor LLMs and design safety mechanisms for their use in applications that involve sensitive societal topics, such as sexism.
- North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
- North America > Canada > Ontario > Toronto (0.04)
- Asia > Singapore (0.04)
- (4 more...)
A Survey on Moral Foundation Theory and Pre-Trained Language Models: Current Advances and Challenges
Zangari, Lorenzo, Greco, Candida M., Picca, Davide, Tagarelli, Andrea
Moral values have deep roots in early civilizations, codified within norms and laws that regulated societal order and the common good. They play a crucial role in understanding the psychological basis of human behavior and cultural orientation. Moral Foundations Theory (MFT) is a well-established framework that identifies the core moral foundations underlying the manner in which different cultures shape individual and social lives. Recent advancements in natural language processing, particularly Pre-trained Language Models (PLMs), have enabled the extraction and analysis of moral dimensions from textual data. This survey presents a comprehensive review of MFT-informed PLMs, providing an analysis of moral tendencies in PLMs and their application in the context of the MFT. We also review relevant datasets and lexicons and discuss trends, limitations, and future directions. By providing a structured overview of the intersection between PLMs and MFT, this work brings moral psychology insights into the realm of PLMs, paving the way for further research and development in creating morally aware AI systems.
- North America > United States > Washington > King County > Seattle (0.14)
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Europe > France (0.14)
- (22 more...)
- Overview (1.00)
- Research Report > New Finding (0.46)
- Health & Medicine > Therapeutic Area > Immunology (1.00)
- Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)
- Law (0.92)
- (2 more...)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)